Online implicit agent modelling
Abstract
The traditional view of agent modelling is to infer the explicit parameters of another agent’s strategy (i.e., their probability of taking each action in each situation). Unfortunately, in complex domains with high-dimensional strategy spaces, modelling every parameter often requires a prohibitive number of observations. Furthermore, even given a model of such a strategy, computing a response that is robust to modelling error may be impractical online. Instead, we propose an implicit modelling framework in which the agent estimates the utility of each strategy in a fixed portfolio of pre-computed strategies. Using the domain of heads-up limit Texas hold’em poker, this work describes an end-to-end approach for building an implicit modelling agent. We compute robust response strategies, show how to select strategies for the portfolio, and apply existing variance reduction and online learning techniques to dynamically adapt the agent’s strategy to its opponent. We validate the approach by showing that our implicit modelling agent would have won the heads-up limit opponent exploitation event in the 2011 Annual Computer Poker Competition.
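The idea of estimating the utility of a fixed portfolio online can be illustrated with a small sketch. The abstract does not name the online learning technique it uses, so the code below substitutes Exp3, a standard adversarial bandit algorithm, purely as an illustration. The class name `Exp3Portfolio`, the function `simulate`, the exploration rate `gamma`, and the Bernoulli win rates are all hypothetical, and the variance reduction step mentioned in the abstract is not modelled.

```python
import math
import random


class Exp3Portfolio:
    """Exp3 bandit over a fixed portfolio of pre-computed strategies.

    Illustrative assumption only: the source does not say which online
    learning algorithm the agent uses.
    """

    def __init__(self, n_strategies, gamma=0.1, rng=None):
        self.n = n_strategies
        self.gamma = gamma  # exploration rate (assumed value)
        self.log_weights = [0.0] * n_strategies
        self.rng = rng or random.Random()

    def probabilities(self):
        # Subtract the max log-weight before exponentiating to avoid overflow.
        m = max(self.log_weights)
        w = [math.exp(lw - m) for lw in self.log_weights]
        total = sum(w)
        return [(1 - self.gamma) * wi / total + self.gamma / self.n for wi in w]

    def select(self):
        # Sample the index of the strategy to play this round.
        return self.rng.choices(range(self.n), weights=self.probabilities())[0]

    def update(self, chosen, reward):
        # reward is assumed to be scaled into [0, 1].
        p = self.probabilities()[chosen]
        estimate = reward / p  # importance-weighted utility estimate
        self.log_weights[chosen] += self.gamma * estimate / self.n


def simulate(rounds=20000, seed=0, win_rates=(0.2, 0.5, 0.8)):
    """Play a portfolio against a made-up opponent; return final probabilities.

    win_rates are hypothetical per-strategy success rates, not poker results.
    """
    rng = random.Random(seed)
    agent = Exp3Portfolio(len(win_rates), rng=rng)
    for _ in range(rounds):
        k = agent.select()
        reward = 1.0 if rng.random() < win_rates[k] else 0.0
        agent.update(k, reward)
    return agent.probabilities()


if __name__ == "__main__":
    print(simulate())
```

The agent never models the opponent's action probabilities. It only tracks how well each portfolio strategy has performed, which is what makes the approach implicit. In this toy setting the probability mass should shift toward the strategy with the highest assumed win rate.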
Publication date: 2013